1. Introduction

A trend in compressed sensing (CS) is to exploit structure
for improved reconstruction performance. In the basic CS
model (i.e., the single measurement vector model), exploiting
the clustering structure among nonzero elements in the
solution vector has drawn much attention, and many algorithms
have been proposed, such as group Lasso (Yuan & Lin, 2006).
However, few algorithms explicitly consider correlation
within a cluster. Meanwhile, in the multiple measurement
vector (MMV) model (Cotter et al., 2005), correlation among
multiple solution vectors is largely ignored. Although
several recently developed algorithms exploit this
correlation, such as Kalman Filtered Compressed Sensing
(KF-CS) (Vaswani, 2008), they need to know the correlation
structure a priori, which limits their effectiveness in
practical problems.

Recently, we developed a sparse Bayesian learning (SBL)
algorithm, namely T-SBL, and its variants (Zhang & Rao,
2011a;b; 2010), which adaptively learn the correlation
structure and exploit this information to significantly
improve reconstruction performance. Here we establish their
connections to other popular algorithms, such as the group
Lasso, iterative reweighted $\ell_1$ and $\ell_2$ algorithms,
and algorithms for time-varying sparsity. We also provide
strategies to improve these existing algorithms.

2. T-SBL: Exploiting Correlation in the 
MMV Model 

The MMV model is expressed as:

$$Y = \Phi X + V. \tag{1}$$

Here $Y \triangleq [Y_{\cdot 1}, \cdots, Y_{\cdot L}] \in \mathbb{R}^{N \times L}$
is an available measurement matrix consisting of $L$ measurement vectors.
$\Phi \in \mathbb{R}^{N \times M}$ ($N \ll M$) is a known dictionary matrix,
and any $N$ columns of $\Phi$ are linearly independent.
$X = [X_{\cdot 1}, \cdots, X_{\cdot L}] \in \mathbb{R}^{M \times L}$ is an
unknown, full column-rank solution matrix. A key assumption here
is that $X$ has only a few nonzero rows (i.e., the common
sparsity assumption (Cotter et al., 2005)). $V$ is
an unknown noise matrix.

Most existing algorithms ignore the correlation struc- 
ture in each row of X. In contrast, T-SBL considers 
such correlation by assuming the joint density of each 
row vector of X to be 

$$p(X_{i\cdot}; \gamma_i, B_i) \sim \mathcal{N}(0, \gamma_i B_i), \quad i = 1, \cdots, M$$

where $\gamma_i$ is a nonnegative hyperparameter determining
whether the $i$-th row $X_{i\cdot}$ is zero or not. $B_i$ is a positive
definite matrix that captures the correlation structure
of $X_{i\cdot}$.

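By letting $y = \mathrm{vec}(Y^T) \in \mathbb{R}^{NL \times 1}$, $D = \Phi \otimes I_L$,
$x = \mathrm{vec}(X^T) \in \mathbb{R}^{ML \times 1}$, and $v = \mathrm{vec}(V^T)$, we can
transform the MMV model (1) to the following one:

$$y = Dx + v, \tag{2}$$

where $x$ is block-sparse with each block $x_i \in \mathbb{R}^{L \times 1}$,
i.e., $x_i = X_{i\cdot}^T$. Here $\otimes$ indicates the Kronecker product,
and $\mathrm{vec}(\cdot)$ is the vectorization operator.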
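The vectorization step can be checked numerically. The following is a minimal sketch in NumPy, with small illustrative sizes chosen here (they are not from the paper) and a dense $X$ used only for the check. It relies on the fact that $\mathrm{vec}(A^T)$ stacks the rows of $A$, which is what a row-major `reshape(-1)` does.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, L = 4, 6, 3                        # illustrative sizes (assumed)
Phi = rng.standard_normal((N, M))        # dictionary matrix
X = rng.standard_normal((M, L))          # solution matrix (dense, for the check only)
V = 0.01 * rng.standard_normal((N, L))   # noise matrix
Y = Phi @ X + V                          # MMV model (1)

# vec(A^T) stacks the rows of A; NumPy's row-major reshape does exactly that.
y, x, v = Y.reshape(-1), X.reshape(-1), V.reshape(-1)
D = np.kron(Phi, np.eye(L))              # D = Phi (x) I_L

# Block i of x (length L) is the i-th row of X.
blocks = x.reshape(M, L)
```

With these definitions `D @ x + v` reproduces `y`, so models (1) and (2) describe the same data.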

In the SBL framework (Tipping, 2001), the T-SBL al- 
gorithm was derived as follows (Zhang & Rao, 2011a): 

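$$x \leftarrow (\lambda \Sigma_0^{-1} + D^T D)^{-1} D^T y$$

$$\Sigma_x = \Sigma_0 - \Sigma_0 D^T (\lambda I + D \Sigma_0 D^T)^{-1} D \Sigma_0$$

$$\gamma_i \leftarrow \frac{1}{L} \mathrm{Tr}\big[B^{-1} \Sigma_x^i\big] + \frac{1}{L} x_i^T B^{-1} x_i, \quad \forall i \tag{3}$$

$$B \leftarrow \frac{1}{M} \sum_{i=1}^{M} \frac{\Sigma_x^i + x_i x_i^T}{\gamma_i}$$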

where $\Sigma_x^i$ is the $i$-th principal diagonal block of size
$L \times L$ in $\Sigma_x$. $\Sigma_0$ is a block diagonal matrix with
each block given by $\gamma_i B$. In this algorithm we assume
$B_i = B$ ($\forall i$) to avoid overfitting. $\lambda$ is the noise
variance, which is also estimated in T-SBL; for clarity
we omit its learning rule here (and we also omit such
learning rules in the following algorithms). A simplified
version, which has a much lower computational load,
is also derived in (Zhang & Rao, 2011a).
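To make the update rules concrete, here is a minimal NumPy sketch of a single pass over them. Several choices are assumptions made for illustration and are not taken from (Zhang & Rao, 2011a): the function name `tsbl_step`, keeping $\lambda$ fixed (its learning rule is omitted above), updating $B$ with the freshly updated $\gamma_i$, and skipping rows whose $\gamma_i$ is numerically zero. The mean is computed as $\Sigma_0 D^T(\lambda I + D\Sigma_0 D^T)^{-1}y$, which equals $(\lambda\Sigma_0^{-1} + D^TD)^{-1}D^Ty$ whenever $\Sigma_0$ is invertible.

```python
import numpy as np

def tsbl_step(Phi, Y, gamma, B, lam, prune_tol=1e-12):
    """One pass over the T-SBL rules; lam is held fixed (assumption)."""
    N, M = Phi.shape
    L = Y.shape[1]
    D = np.kron(Phi, np.eye(L))                 # D = Phi (x) I_L
    y = Y.reshape(-1)                           # y = vec(Y^T)
    Sigma0 = np.kron(np.diag(gamma), B)         # block diagonal, blocks gamma_i * B
    S = lam * np.eye(N * L) + D @ Sigma0 @ D.T
    K = np.linalg.solve(S, D @ Sigma0).T        # Sigma0 D^T S^{-1} (S, Sigma0 symmetric)
    x = K @ y                                   # mean of x
    Sigma_x = Sigma0 - K @ D @ Sigma0           # covariance of x

    B_inv = np.linalg.inv(B)
    gamma_new = np.empty(M)
    B_new = np.zeros((L, L))
    for i in range(M):
        blk = slice(i * L, (i + 1) * L)
        xi, Si = x[blk], Sigma_x[blk, blk]      # i-th block and its L x L covariance
        gamma_new[i] = (np.trace(B_inv @ Si) + xi @ B_inv @ xi) / L
        if gamma_new[i] > prune_tol:            # skip pruned rows (assumption)
            B_new += (Si + np.outer(xi, xi)) / gamma_new[i]
    B_new /= M
    return x.reshape(M, L), gamma_new, B_new
```

Calling `tsbl_step` repeatedly, feeding `gamma_new` and `B_new` back in, gives the outer iteration. A practical implementation would also prune rows with small $\gamma_i$ from the dictionary to keep the linear systems small.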

We now describe an experiment (Zhang & Rao, 2011a)
showing that the proposed T-SBL and its simplified
version T-MSBL have superior performance when
correlation exists among the solution vectors. In the
experiment the Gaussian random dictionary matrix $\Phi$
had size 25 x 125, the number of nonzero rows
of $X$ was $K = 12$, and $L$ varied from 1 to 4. The
correlation among solution vectors was and 0.9 in two
cases, respectively. Five algorithms were compared
(for details see (Zhang & Rao, 2011a)) to T-SBL and
T-MSBL. To remove the influence of the regularization
parameters of the algorithms, we considered a
noiseless case. Results (Fig. 1) show that when the
solution vectors are highly correlated, all the compared
algorithms perform very poorly because they cannot
exploit such correlation.
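A problem instance of this kind can be generated with the sketch below. The paper does not say here how the correlated rows are drawn, so the AR(1)-style covariance $B_{jk} = \beta^{|j-k|}$, the unit-variance Gaussian entries, and the function name `make_mmv_problem` are assumptions made for illustration only.

```python
import numpy as np

def make_mmv_problem(N=25, M=125, K=12, L=4, beta=0.9, seed=0):
    """Noiseless MMV instance with K nonzero, correlated rows (assumed AR(1) model)."""
    rng = np.random.default_rng(seed)
    Phi = rng.standard_normal((N, M))                  # Gaussian random dictionary
    support = rng.choice(M, size=K, replace=False)     # indices of nonzero rows
    idx = np.arange(L)
    B = beta ** np.abs(idx[:, None] - idx[None, :])    # assumed row covariance
    C = np.linalg.cholesky(B)                          # B = C C^T
    X = np.zeros((M, L))
    X[support] = rng.standard_normal((K, L)) @ C.T     # each nonzero row ~ N(0, B)
    Y = Phi @ X                                        # noiseless measurements
    return Phi, X, Y, support
```

Varying `L` from 1 to 4 and `beta` between the two correlation cases gives inputs with the same dimensions as in the experiment; the failure criterion itself is defined in (Zhang & Rao, 2011a) and is not reproduced here.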

In the following we connect T-SBL to other related 
models. 




[Panels (a) and (b) correspond to the two correlation cases, with (b) being correlation 0.9; the horizontal axis of each panel is the number of measurement vectors L.]

Figure 1. Failure rates of various algorithms.

3. Connection to Iterative Reweighted
$\ell_2$ Framework in the MMV Model

The iterative reweighted $\ell_2$ minimization framework
extended for the MMV problem (in the noisy case) computes
the solution at the $(k+1)$-th iteration as follows
(Wipf & Nagarajan, 2010):

$$X^{(k+1)} = \arg\min_X \|Y - \Phi X\|_F^2 + \lambda \sum_i w_i^{(k)} \big(\|X_{i\cdot}\|_q\big)^2 \tag{4}$$

where $w_i^{(k)}$ is the weight depending on the previous
estimate of $X$. Typically $q = 2$ or $q = \infty$. In
(Zhang & Rao, 2011b) we have shown that T-SBL can
be interpreted as an iterative reweighted $\ell_2$ algorithm:

$$X^{(k+1)} = \arg\min_X \Big\{ \|Y - \Phi X\|_F^2 + \lambda \sum_{i=1}^{M} (\gamma_i^{(k)})^{-1} X_{i\cdot} (B^{(k)})^{-1} X_{i\cdot}^T \Big\}.$$

The learning rules for $\gamma_i^{(k)}$ and $B^{(k)}$ are given in
(Zhang & Rao, 2011b). Note that $X_{i\cdot} B^{-1} X_{i\cdot}^T$ is the
quadratic Mahalanobis distance (MD) measure of $X_{i\cdot}$.
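Because the MD penalty is quadratic, each reweighted $\ell_2$ step has a closed-form solution in the vectorized model. The sketch below assumes all $\gamma_i > 0$ and uses the hypothetical function name `reweighted_l2_step`. It writes the penalty as $\lambda\, x^T(\Gamma^{-1} \otimes B^{-1})x$ with $\Gamma = \mathrm{diag}(\gamma_1, \ldots, \gamma_M)$ and solves the normal equations $(D^TD + \lambda\,\Gamma^{-1} \otimes B^{-1})x = D^Ty$.

```python
import numpy as np

def reweighted_l2_step(Phi, Y, gamma, B, lam):
    """Minimize ||Y - Phi X||_F^2 + lam * sum_i X_i B^{-1} X_i^T / gamma_i over X."""
    M, L = Phi.shape[1], Y.shape[1]
    D = np.kron(Phi, np.eye(L))                                    # D = Phi (x) I_L
    Sigma0_inv = np.kron(np.diag(1.0 / gamma), np.linalg.inv(B))   # requires gamma_i > 0
    x = np.linalg.solve(D.T @ D + lam * Sigma0_inv, D.T @ Y.reshape(-1))
    return x.reshape(M, L)                                         # row i is block x_i
```

The solution has the same form as the T-SBL update for $x$ in (3), which is one way to see the reweighted $\ell_2$ interpretation.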

This interpretation reveals that the potential advantage of
T-SBL comes from using the MD measure of $X_{i\cdot}$ in
the penalty, instead of the typical $\ell_q$ ($q = 2, \infty$)
norms of $X_{i\cdot}$ (Negahban & Wainwright, 2011). By
comparing it to M-SBL (Wipf & Rao, 2007), another
SBL algorithm that ignores the correlation in each $X_{i\cdot}$,
we found that T-SBL also applies the MD measure to
the weights. These observations motivated us
to modify existing iterative reweighted $\ell_2$ algorithms
for better performance, as shown in (Zhang & Rao,
2011b).

Although a strict mathematical proof is missing, these 
empirical results suggest that the mixed norm based 
penalties as shown in (4) are not very effective for solv- 
ing the MMV problem in practice, since the unknown 
solution vectors are often correlated. 

4. Connection to Iterative Reweighted
$\ell_1$ Framework and Block Sparsity
Model

The iterative reweighted $\ell_1$ minimization framework
(Candes et al., 2008) extended for the MMV problem
is given by (Wipf & Nagarajan, 2010)

$$X^{(k+1)} = \arg\min_X \|Y - \Phi X\|_F^2 + \lambda \sum_i w_i^{(k)} \|X_{i\cdot}\|_q. \tag{5}$$

We now connect T-SBL to this framework.

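For the model (2) the cost function to estimate all the
hyperparameters $\Theta = \{B, \gamma_i, \forall i\}$ is:

$$\mathcal{L}(\Theta) \triangleq -2 \log \int p(y|x; \lambda)\, p(x; \gamma_i, B, \forall i)\, dx = \log|\lambda I + D \Sigma_0 D^T| + y^T (\lambda I + D \Sigma_0 D^T)^{-1} y.$$

Using the identity
$y^T (\lambda I + D \Sigma_0 D^T)^{-1} y \equiv \min_x \big[ \frac{1}{\lambda} \|y - Dx\|_2^2 + x^T \Sigma_0^{-1} x \big]$,
we can upper-bound the above cost function as follows:

$$\mathcal{L}(x, \Theta) = \log|\lambda I + D \Sigma_0 D^T| + \frac{1}{\lambda} \|y - Dx\|_2^2 + x^T \Sigma_0^{-1} x.$$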
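The identity behind this upper bound can be verified numerically. The sketch below uses random matrices of illustrative size (the sizes, seed, and variable names are assumptions) and the closed-form minimizer $x^\star = (D^TD/\lambda + \Sigma_0^{-1})^{-1} D^Ty/\lambda$ of the bracketed quadratic.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, lam = 6, 10, 0.3                      # illustrative sizes and noise level (assumed)
D = rng.standard_normal((n, m))
A = rng.standard_normal((m, m))
Sigma0 = A @ A.T + 0.1 * np.eye(m)          # any positive definite matrix works here
y = rng.standard_normal(n)

# Left-hand side: y^T (lam I + D Sigma0 D^T)^{-1} y
lhs = y @ np.linalg.solve(lam * np.eye(n) + D @ Sigma0 @ D.T, y)

Sigma0_inv = np.linalg.inv(Sigma0)

def bound(x):
    """(1/lam) ||y - D x||_2^2 + x^T Sigma0^{-1} x"""
    return np.sum((y - D @ x) ** 2) / lam + x @ Sigma0_inv @ x

# Minimizer of the quadratic, from setting its gradient to zero.
x_star = np.linalg.solve(D.T @ D / lam + Sigma0_inv, D.T @ y / lam)
rhs = bound(x_star)
```

Here `lhs` and `rhs` agree up to rounding, and `bound(x)` is at least `lhs` for any other `x`, which is why replacing the data term by the bracketed expression gives an upper bound on the cost.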

By first minimizing over each member of $\Theta$ and then
minimizing over $x$, we can get the solution:

$$x = \arg\min_x \big\{ \|y - Dx\|_2^2 + \lambda\, g_{TC}(x) \big\}, \tag{6}$$

with the penalty defined by
$g_{TC}(x) \triangleq \min_{B \succ 0,\, \gamma_i \ge 0,\, \forall i} \big\{ x^T \Sigma_0^{-1} x + \log|\lambda I + D \Sigma_0 D^T| \big\}$.
Using the duality theory (Boyd & Vandenberghe,
2004) as in (Wipf & Nagarajan, 2010), we can
re-express the optimization problem (6) as follows:

$$x^{(k+1)} = \arg\min_x \|y - Dx\|_2^2 + \lambda \sum_i w_i^{(k)} \sqrt{x_i^T (B^{(k)})^{-1} x_i}. \tag{7}$$

The learning rules for $w_i^{(k)}$ and $B^{(k)}$ can be derived
using the duality theory and the gradient method.
Further, using the approximation in (Zhang & Rao,
2011a) we have:

$$X^{(k+1)} = \arg\min_X \|Y - \Phi X\|_F^2 + \lambda \sum_i w_i^{(k)} \sqrt{X_{i\cdot} (B^{(k)})^{-1} X_{i\cdot}^T}. \tag{8}$$

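The learning rules for $w_i^{(k)}$ and $B^{(k)}$ are given by

$$w_i \leftarrow 2 \big( L\, \Phi_i^T (\lambda I + \Phi \Gamma \Phi^T)^{-1} \Phi_i \big)^{1/2},$$

$$B \leftarrow \frac{1}{C} \sum_{i=1}^{M} \frac{X_{i\cdot}^T X_{i\cdot}}{\gamma_i}, \quad \text{with } \gamma_i \triangleq \frac{2 \sqrt{X_{i\cdot} B^{-1} X_{i\cdot}^T}}{w_i}, \tag{9}$$

where $C = \sum_{i=1}^{M} \gamma_i \Phi_i^T (\lambda I + \Phi \Gamma \Phi^T)^{-1} \Phi_i$ and
$\Gamma = \mathrm{diag}(\gamma_1, \cdots, \gamma_M)$. Note that in each iteration $k$ we
need an inner loop to iteratively compute $w_i$, $\gamma_i$ and
$B$ until convergence for a better estimate of $B$. The
inner loop generally takes several iterations, and the
whole algorithm needs very few outer-loop iterations
to achieve its best performance (see Fig. 2). In fact,
each iteration of (8) yields a sparse solution.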
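As an illustration of the weight computation, the sketch below evaluates $w_i = 2\,(L\,\Phi_i^T(\lambda I + \Phi\Gamma\Phi^T)^{-1}\Phi_i)^{1/2}$ for all $i$ at once. This is how the first rule in (9) is read here; $\Phi_i$ is taken to be the $i$-th column of $\Phi$, and the function name `tsbl_l1_weights` is hypothetical. The updates of $\gamma_i$ and $B$ and the inner loop are not reproduced.

```python
import numpy as np

def tsbl_l1_weights(Phi, gamma, lam, L):
    """w_i = 2 * sqrt(L * Phi_i^T (lam I + Phi Gamma Phi^T)^{-1} Phi_i) for every column i."""
    N = Phi.shape[0]
    S = lam * np.eye(N) + (Phi * gamma) @ Phi.T                 # Phi * gamma scales column i by gamma_i
    z = np.einsum("ni,ni->i", Phi, np.linalg.solve(S, Phi))     # z_i = Phi_i^T S^{-1} Phi_i
    return 2.0 * np.sqrt(L * z)
```

Solving one linear system with all columns of $\Phi$ as right-hand sides avoids forming the inverse explicitly and avoids a Python loop over the $M$ columns.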

When $B^{(k)} = I$ and no iteration is performed,
problem (8) reduces to the group Lasso (for the MMV
model). When $B^{(k)} = I$ ($\forall k$) and iterative reweighting
is performed, problem (8) is a typical iterative
reweighted $\ell_1$ algorithm. Thus, T-SBL can be viewed
as a variant of iterative reweighted $\ell_1$ algorithms.
Similar to the $\ell_2$ interpretation in the previous section,
this interpretation also suggests replacing the $\ell_q$ norms
imposed on $X_{i\cdot}$ by the MD measure in both the penalty
and the weights.

To clearly see the advantage of our suggestion,
we repeated the simulation of Fig. 1(b) with $L = 4$.
We used the reweighted $\ell_1$ version of T-SBL,
the reweighted $\ell_1$ version of M-SBL
(which corresponds to the $\ell_1$ version of T-SBL with
$B = I$) (Wipf & Nagarajan, 2010), and the original
reweighted $\ell_1$ algorithm (5) with $q = 2$ and
$w_i^{(k)} = (\|X_{i\cdot}^{(k)}\|_2 + \epsilon)^{-1}$. We also modified this original
reweighted $\ell_1$ algorithm to exploit the correlation by
changing the weights to:

$$w_i^{(k)} = \Big( \sqrt{X_{i\cdot}^{(k)} (B^{(k)})^{-1} (X_{i\cdot}^{(k)})^T} + \epsilon \Big)^{-1},$$

where $B^{(k)}$ can be estimated by the learning rule (9).
Here, however, we set $B^{(k)}$ ($\forall k$) to the true value. The
result (Fig. 2) shows that the algorithms improve when
they exploit the correlation. Notably, the original
iterative reweighted $\ell_1$ algorithm is
greatly improved once the $\ell_2$ norm in its weights is
replaced by the MD measure.

Figure 2. Performance improved when exploiting the correlation.
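The two weighting schemes differ only in how the size of a row is measured. The following sketch computes both for every row of an estimate `X`; the function names and the default `eps` are assumptions for illustration, and `B` is whatever correlation matrix is available (the true one in the experiment above).

```python
import numpy as np

def l2_weights(X, eps=1e-3):
    """Original reweighted l1 weights: 1 / (||X_i||_2 + eps) for each row i."""
    return 1.0 / (np.linalg.norm(X, axis=1) + eps)

def md_weights(X, B, eps=1e-3):
    """Modified weights: 1 / (sqrt(X_i B^{-1} X_i^T) + eps) for each row i."""
    md_sq = np.einsum("il,lk,ik->i", X, np.linalg.inv(B), X)   # X_i B^{-1} X_i^T
    return 1.0 / (np.sqrt(np.maximum(md_sq, 0.0)) + eps)       # clip rounding noise below 0
```

With `B` equal to the identity the two functions coincide. With a strongly correlated `B`, a row whose entries follow the correlation pattern has a smaller MD measure than $\ell_2$ norm and therefore a larger weight, while a row that contradicts the pattern gets a smaller one.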

Note that the model (2) is essentially the same
as the block sparsity model (Yuan & Lin, 2006;
Eldar & Mishali, 2009)¹, a variant of the basic CS
model. Thus T-SBL can be applied to this model.

5. Connection to the Time-Varying
Sparsity Model

The time-varying sparsity model is a natural extension
of the MMV model. It considers the case where
the support of each column of $X$ is time-varying.
Several algorithms have been proposed, such as
Kalman Filtered Compressed Sensing (KF-CS)
(Vaswani, 2008) and Least-Squares Compressed Sensing
(LS-CS) (Vaswani, 2010). Since this model generally
assumes that the support changes slowly, we can
view a time-varying sparsity model as a concatenation
of several MMV models, where the support does not
change within each MMV model. Therefore, T-SBL
can be used in this model. Note that exploiting the
multiple measurement vectors is important in this model
because of the enhanced support-recovery ability
afforded by the MMV model; unfortunately, this
strategy is missing in current approaches.
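The concatenation view amounts to cutting the measurement matrix into consecutive groups of columns and treating each group as its own MMV problem. A minimal sketch follows, with the hypothetical helper name `split_into_mmv_segments` and the assumption that the number of columns is a multiple of the segment length.

```python
import numpy as np

def split_into_mmv_segments(Y, cols_per_segment):
    """Cut Y (N x T) into consecutive blocks of columns, one per MMV model."""
    T = Y.shape[1]
    if T % cols_per_segment != 0:
        raise ValueError("number of columns must be a multiple of cols_per_segment")
    return np.split(Y, T // cols_per_segment, axis=1)
```

Each returned block can then be passed to an MMV solver such as T-SBL or M-SBL, and the per-segment estimates placed side by side to form the estimate of the full $X$. Shorter segments follow support changes more closely, while longer segments give each MMV problem more measurement vectors.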

To verify this strategy, we conducted an experiment using
KF-CS, LS-CS, T-SBL and M-SBL. The Gaussian
dictionary matrix was of size 60 x 256. The number of
columns of $X$ was 50. The number of nonzero rows, $K$,
during the first 15 columns of $X$ was 15. $K$ was
increased by 10 at the 16th column and again at the 31st
column of $X$. From the 26th column onward, 5 nonzero
rows were set to zero. Each nonzero row had temporal
correlation varying from 0.7 to 0.99, and had a duration
of 20 columns (if it was not set to zero). The SNR was
about 20 dB. KF-CS and LS-CS were given the true
noise variance and the true correlation information,
whereas both T-SBL and M-SBL learned the noise
variance. T-SBL also learned the correlation structures.
When running T-SBL and M-SBL, we approximated the
time-varying sparsity model in two ways. The first used
the concatenation of 25 MMV models, each containing
2 columns. The second used 10 MMV models, each
containing 5 columns. The experiment was repeated 100 times.
Figure 3 shows that the two MMV algorithms perform
better than KF-CS and LS-CS. Furthermore,
T-SBL is superior to M-SBL. The experiment code
can be downloaded from the first author's website.

Figure 3. Performance in a time-varying sparsity case.

¹ Now D is the original dictionary matrix.

6. Conclusions 

A general methodology to capture the sparsity structure
of signals is to use combinations or hierarchies of various
norms (Zhao et al., 2009). However, our work showed
that another effective way is to use covariance
estimation methods to learn the sparsity structures in
the framework of SBL. In addition, we showed that
iterative reweighted $\ell_1$ and $\ell_2$ algorithms for the MMV
model and the block sparsity model can be greatly
improved by replacing the $\ell_q$ norms imposed
on the blocks/groups with the Mahalanobis distance
measure, whose covariance matrix is learned
data-adaptively.